Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add HTTP listener service input plugin #1407

Closed
wants to merge 28 commits into from
Closed

Add HTTP listener service input plugin #1407

wants to merge 28 commits into from

Conversation

bagelswitch
Copy link
Contributor

@bagelswitch bagelswitch commented Jun 23, 2016

The HTTP listener is a service input plugin that listens for messages sent via HTTP POST.
The plugin expects messages in the InfluxDB line-protocol ONLY, other Telegraf input data formats are not supported.
The intent of the plugin is to allow Telegraf to serve as a proxy/router for the /write endpoint of the InfluxDB HTTP API.

Example usage:

curl -i -X POST 'http://localhost:8186/write' --data-binary 'cpu_load_short,host=server01,region=us-west value=0.64 1434055562000000000'

Required for all PRs:

  • CHANGELOG.md updated
  • Sign CLA (if not already signed)
  • README.md updated (if adding a new plugin)

@@ -25,6 +25,7 @@ github.com/gorilla/mux c9e326e2bdec29039a3761c07bece13133863e1e
github.com/hailocab/go-hostpool e80d13ce29ede4452c43dea11e79b9bc8a15b478
github.com/hashicorp/consul 5aa90455ce78d4d41578bafc86305e6e6b28d7d2
github.com/hpcloud/tail b2940955ab8b26e19d43a43c4da0475dd81bdb56
github.com/hydrogen18/stoppableListener dadc9ccc400c712e5a316107a5c462863919e579
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is only ~40 lines of code, you should just copy it in-repo instead of as a dependency

Copy link
Contributor Author

@bagelswitch bagelswitch Jun 23, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, done!

res.WriteHeader(500)
res.Write([]byte("ERROR parsing metrics"))
}
res.WriteHeader(204)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for these numbers you should use the http package constants, http.StatusNoContent, http.StatusOK, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@kaey
Copy link
Contributor

kaey commented Jul 6, 2016

There is https://github.com/influxdata/influxdb-relay
Is this plugin really needed?

@bagelswitch
Copy link
Contributor Author

@kaey, in our use case the telegraf agent is already co-located with, for example, a containerized application instance in a PaaS - it sits either w/in the instance container, or on the container host, and we use multiple collectors to gather host, middleware, and application metrics from the instance - with this plugin, influx metrics emitters w/in the applications are also just pointed at the local listener - so we get a single stream of metrics covering the complete stack for this application instance that we can direct somewhere.

So in our view, this is an agent/forward-proxy use case, not an HA/reverse-proxy case where influxdb-relay would be appropriate (in fact we will use both, pointing these telegraf instances at balanced influxdb-relay instances for HA).

WriteTimeout: time.Duration(writeTimeout) * time.Second,
}

err = server.Serve(t.listener)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return server.Serve(t.listener)

@bagelswitch
Copy link
Contributor Author

Addressed your comments, @kaey


type HttpListener struct {
ServiceAddress string
ReadTimeout string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type of timeout fields should be time.Duration

@bagelswitch
Copy link
Contributor Author

Addressed your further comments, @kaey

@jackzampolin
Copy link
Contributor

jackzampolin commented Jul 20, 2016

I was able to quickly get this up and running to forward points from influx_stress through telegraf to influxdb.

FWIW I was able to get throughput on the plugin of 65k points a second locally while breaking the 10k point batches I was sending it into 1k point batches for writing to influx. For passing through 10k point batches throughput was closer to 80k points a second.

When writing 10 point batches and outputting 10k point batches, more the design point, testing locally the plugin was easily able to handle 4k requests a second. It also pegged 2 of my CPUs.

I also tested two telegraf instances configured as follows:

$ telegraf -sample-config -input filter http_listener -output-filter influxdb > telegraf1.conf
$ telegraf -sample-config -input filter cpu:mem:disk -output-filter influxdb > telegraf2.conf

And after just changing the output address for the instance collecting cpu:mem:disk the relaying function worked just fine.

I think this looks good! LGTM 👍

@kaey
Copy link
Contributor

kaey commented Jul 21, 2016

LGTM

@superbool
Copy link

why not support json format and when it will be published ? thank you !

@bagelswitch
Copy link
Contributor Author

@superbool - I didn't think it would make sense to support other formats in this particular input, as it's replicating the InfluxDB write endpoint, which only supports the line protocol.

@kostasb
Copy link

kostasb commented Aug 22, 2016

A note that needs to go into documentation for this plugin when/if merged:

When chaining Telegraf instances, CREATE DATABASE requests receive 200 OK with message body `{"results":[]}` but they are not relayed. The edge Telegraf instance, meaning the one that will utterly submit data to InfluxDB, determines the destination database.

The PR currently needs a re-base due to new commits in the master branch.

@swang-pro
Copy link

Great news ! When will this PR be integrated in an official release ?
Thank you

@harlow
Copy link

harlow commented Aug 28, 2016

+1 on getting this into next release

@ddes
Copy link

ddes commented Aug 29, 2016

+1, looks good after some tests. Thanks for your work, I really need it.

@maccman
Copy link

maccman commented Aug 31, 2016

👍 - would love to see this merged

sparrc added a commit that referenced this pull request Sep 6, 2016
also remove locking around adding metrics. Instead, keep a waitgroup on
the ServeHTTP function and wait for that to finish before returning from
the Stop() function

closes #1407
@sparrc
Copy link
Contributor

sparrc commented Sep 6, 2016

@bagelswitch just an FYI that I've made a couple test additions, removed the locks around adding metrics (I had previously done this for the udp and tcp listeners as well), and squashed commits

this PR will be closed once I've merged that (will be done by the end of today) #1715

sparrc added a commit that referenced this pull request Sep 6, 2016
also remove locking around adding metrics. Instead, keep a waitgroup on
the ServeHTTP function and wait for that to finish before returning from
the Stop() function

closes #1407
@sparrc sparrc closed this in 301c79e Sep 6, 2016
@sparrc sparrc added this to the 1.1.0 milestone Sep 7, 2016
jackzampolin pushed a commit that referenced this pull request Oct 7, 2016
also remove locking around adding metrics. Instead, keep a waitgroup on
the ServeHTTP function and wait for that to finish before returning from
the Stop() function

closes #1407
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants